Study of Jacobian compensation using linear transformation of conventional MFCC for VTLN
Abstract
In this paper, we present a linear transformation (LT) that obtains warped features from unwarped features during vocal-tract length normalisation (VTLN). This LT between the warped and unwarped features is obtained within the conventional MFCC framework, without any modification to the signal processing steps of the feature extraction stage. Further, using the proposed LT, we study the effect of the Jacobian on VTLN performance and show that it provides additional improvement in recognition performance. The Jacobian of the proposed LT is simply the determinant of the LT matrix. Jacobian compensation is not done in conventional VTLN because the relation between warped and unwarped features is not known. We also study the effect of cepstral variance normalisation (CVN), which is often used as an approximation for Jacobian compensation in conventional VTLN. We show that the proposed Jacobian compensation gives better or comparable performance when compared to CVN.
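As a rough illustration of why the Jacobian term matters when a warp is chosen by likelihood, the sketch below scores linearly transformed feature frames against a single diagonal-covariance Gaussian and optionally adds log|det A| per frame. Everything here is an assumption made for illustration: the function names, the toy 2-dimensional frames, the single Gaussian standing in for an acoustic model, and the scaling matrices used in place of the paper's actual LT matrices.

```python
import numpy as np

def gaussian_loglik(x, mean, var):
    """Log-density of a diagonal-covariance Gaussian evaluated at x."""
    return -0.5 * np.sum(np.log(2 * np.pi * var) + (x - mean) ** 2 / var)

def vtln_score(frames, A, mean, var, use_jacobian=True):
    """Total log-likelihood of frames after the linear transform y = A x.

    With use_jacobian=True, log|det A| is added once per frame, which is
    the change-of-variables term for a linear feature transformation.
    """
    warped = frames @ A.T  # apply A to every frame (one frame per row)
    ll = sum(gaussian_loglik(y, mean, var) for y in warped)
    if use_jacobian:
        _, logabsdet = np.linalg.slogdet(A)  # numerically stable log|det A|
        ll += frames.shape[0] * logabsdet
    return ll

def select_warp(frames, transforms, mean, var, use_jacobian=True):
    """Return the key of the transform with the highest score."""
    return max(
        transforms,
        key=lambda a: vtln_score(frames, transforms[a], mean, var, use_jacobian),
    )

# Toy data (hypothetical): four unit-norm frames, a zero-mean unit-variance model.
frames = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]])
mean, var = np.zeros(2), np.ones(2)
# Hypothetical "warps": identity versus a transform that shrinks the features.
transforms = {1.0: np.eye(2), 0.5: 0.5 * np.eye(2)}

print(select_warp(frames, transforms, mean, var, use_jacobian=False))
print(select_warp(frames, transforms, mean, var, use_jacobian=True))
```

In this toy setting the uncompensated score favours the shrinking transform simply because it pulls every frame toward the model mean, while the compensated score penalises that shrinkage through the determinant and keeps the identity. This only demonstrates the mechanism; it says nothing about the recognition results reported in the paper.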
Similar resources
A study on the influence of covariance adaptation on jacobian compensation in vocal tract length normalization
In this paper, we first show that accounting for Jacobian in Vocal-Tract Length Normalization (VTLN) will degrade the performance when there is a mismatch between the train and test speaker conditions. VTLN is implemented using our recently proposed approach of linear transformation of conventional MFCC, i.e. a feature transformation. In this case, Jacobian is simply the determinant of the line...
Revisiting VTLN using linear transformation on conventional MFCC
In this paper, we revisit the linear transformation for VTLN on conventional MFCC proposed by Sanand et al. in [1], using the idea of band-limited interpolation. The filter-bank is modified to include half-filters at zero and Nyquist frequencies, as the full symmetric spectrum is required for performing band-limited interpolation. In this paper, we show that the filter-bank with half-filters doe...
Implementing frequency-warping and VTLN through linear transformation of conventional MFCC
In this paper, we show that frequency-warping (including VTLN) can be implemented through linear transformation of conventional MFCC. Unlike the Pitz-Ney [1] continuous domain approach, we directly determine the relation between frequency-warping and the linear-transformation in the discrete-domain. The advantage of such an approach is that it can be applied to any frequency-warping and is not ...
Frequency warping for VTLN and speaker adaptation by linear transformation of standard MFCC
Vocal Tract Length Normalization (VTLN) for standard filterbank-based Mel Frequency Cepstral Coefficient (MFCC) features is usually implemented by warping the center frequencies of the Mel filterbank, and the warping factor is estimated using the maximum likelihood score (MLS) criterion (Lee and Rose, 1998). A linear transform (LT) equivalent for frequency warping (FW) would enable more efficie...
Using VTLN matrices for rapid and computationally-efficient speaker adaptation with robustness to first-pass transcription errors
In this paper, we propose to combine the rapid adaptation capability of conventional Vocal Tract Length Normalization (VTLN) with the computational efficiency of transform-based adaptation such as MLLR or CMLLR. VTLN requires the estimation of only one parameter and is, therefore, most suited for the cases where there is little adaptation data (i.e. rapid adaptation). In contrast, transform-bas...